[v1.x] Backport #17702 and #17872 to v1.x branch #18038
Conversation
Support projection feature for LSTM on CPU (Only Inference) (apache#17702)
* Support projection feature for LSTM on CPU
* test solution for -Werror=maybe-uninitialized
* Check device type when create state
* Document the projection feature of LSTM for RNN operator
* Minor fix
* Re-run CI
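For readers unfamiliar with the feature being backported: an LSTM with projection (LSTMP) inserts a linear map after the cell output, so the recurrent state fed back into the gates is smaller than the hidden state. Below is a minimal single-step NumPy sketch of the idea; all names and sizes are illustrative and not taken from the MXNet implementation.

```python
import numpy as np

# Illustrative sizes (assumptions, not from the PR).
input_size, hidden_size, proj_size = 4, 8, 3
rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Gate weights act on [x_t, r_{t-1}]: the recurrent input is the *projected*
# state r (size proj_size), not the full hidden state h (size hidden_size).
W = rng.standard_normal((4 * hidden_size, input_size + proj_size)) * 0.1
b = np.zeros(4 * hidden_size)
W_proj = rng.standard_normal((proj_size, hidden_size)) * 0.1  # projection matrix

def lstmp_step(x_t, r_prev, c_prev):
    """One LSTM step with projection (LSTMP)."""
    z = W @ np.concatenate([x_t, r_prev]) + b
    i, f, g, o = np.split(z, 4)
    c_t = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
    h_t = sigmoid(o) * np.tanh(c_t)
    r_t = W_proj @ h_t  # project the hidden state down to proj_size
    return r_t, c_t

x = rng.standard_normal(input_size)
r, c = lstmp_step(x, np.zeros(proj_size), np.zeros(hidden_size))
print(r.shape, c.shape)  # -> (3,) (8,)
```

The projected state halves (or more) the size of the recurrent matrices, which is why the feature matters for large models; the PR adds CPU inference support for this path.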
Fix issue of zeros gradients w.r.t. RNN bias when num_layers > 1 (apache#17872)
* Fix issue of zeros gradients w.r.t. RNN bias when num_layers > 1
* Use nd.copy() to initialize parameters of new operator
* Add check for output states
* Initialize i2h/h2h_weights with zeros for rnn_relu/tanh, and reduce size
* Split fused rnn layer test into tests of individual mode
* Skip lstm and gru tests on CPU context without DNNL
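The bug being fixed is that bias gradients for upper layers of a stacked RNN came back as zeros. A quick way to see what a correct result looks like is a finite-difference check on a tiny two-layer vanilla RNN; this NumPy sketch is purely illustrative (names and sizes are assumptions, not the MXNet code or its tests).

```python
import numpy as np

# Minimal 2-layer vanilla RNN to illustrate the invariant the fix restores:
# the gradient w.r.t. the bias of *every* layer should be non-zero.
rng = np.random.default_rng(1)
T, in_size, hid = 5, 3, 4

Wx = [rng.standard_normal((hid, in_size)) * 0.5, rng.standard_normal((hid, hid)) * 0.5]
Wh = [rng.standard_normal((hid, hid)) * 0.5 for _ in range(2)]
b = [rng.standard_normal(hid) * 0.5 for _ in range(2)]
x_seq = rng.standard_normal((T, in_size))

def loss(bias, layer):
    """Sum of the final hidden state, with `bias` substituted into `layer`."""
    bs = list(b)
    bs[layer] = bias
    inp = x_seq
    for l in range(2):
        h, outs = np.zeros(hid), []
        for x_t in inp:
            h = np.tanh(Wx[l] @ x_t + Wh[l] @ h + bs[l])
            outs.append(h)
        inp = np.stack(outs)
    return inp[-1].sum()

# Central finite differences for the second layer's bias gradient.
eps, g = 1e-5, np.zeros(hid)
for i in range(hid):
    e = np.zeros(hid)
    e[i] = eps
    g[i] = (loss(b[1] + e, 1) - loss(b[1] - e, 1)) / (2 * eps)

print(np.abs(g).max() > 0)  # the layer-2 bias gradient must not be all zeros
```

Before apache#17872, the fused CPU path could report all-zero gradients for such upper-layer biases; the commit also splits the fused-layer test per mode so regressions like this are easier to localize.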
Hey @zixuanweeei, thanks for submitting the PR.
CI supported jobs: [clang, edge, unix-gpu, unix-cpu, centos-cpu, windows-cpu, centos-gpu, windows-gpu, website, miscellaneous, sanity]
Thanks @zixuanweeei, adding this to the 1.7.0 roadmap #16864
@mxnet-bot run ci [centos-gpu, unix-gpu]
Jenkins CI successfully triggered: [centos-gpu, unix-gpu]
Unauthorized access detected.
@zixuanweeei Thanks. I see this PR targets the v1.x branch. Will v1.7.x include this PR?
Please include Chai's recent fix: #18018
@stu1130 all 1.x commits made prior to a set date will be included in 1.7. |
Here: #18044
@mxnet-bot run ci [unix-gpu]
Jenkins CI successfully triggered: [unix-gpu]
LGTM
Merging now.
[v1.x] Backport #17702 and #17872 to v1.x branch (#18038)
* Support projection feature for LSTM on CPU (Only Inference) (apache#17702)
  * Support projection feature for LSTM on CPU
  * test solution for -Werror=maybe-uninitialized
  * Check device type when create state
  * Document the projection feature of LSTM for RNN operator
  * Minor fix
  * Re-run CI
* Fix issue of zeros gradients w.r.t. RNN bias when num_layers > 1 (apache#17872)
  * Fix issue of zeros gradients w.r.t. RNN bias when num_layers > 1
  * Use nd.copy() to initialize parameters of new operator
  * Add check for output states
  * Initialize i2h/h2h_weights with zeros for rnn_relu/tanh, and reduce size
  * Split fused rnn layer test into tests of individual mode
  * Skip lstm and gru tests on CPU context without DNNL
Description
As the title says. #17702 and #17872 revised the same lines in test_gluon_rnn.py, so we need to backport them together in a single PR. @ciyongch @TaoLv @pengzhao-intel Also cc @stu1130.